Abstract. In this paper, we demonstrate that the automatic interpretation
of a single photograph of an evacuation plan can be used for the
reconstruction of a coarse indoor model. In addition to the coarse model,
these photographs may also provide the user with two essential values
(initial position and initial orientation) for positioning approaches
delivering only relative coordinates. Such a positioning approach, in turn,
can be used to refine the coarse model, e.g. by automatic reconstruction of
door openings. To make our approach more robust, object knowledge is
introduced by means of a formal grammar for indoor modeling.
1. Introduction

Due to the broad availability of positioning devices (GPS) and digital
maps, the navigation problem concerning the outdoor movement of vehicles
and pedestrians can be considered mostly solved. However, pedestrian
navigation in building interiors has attracted increasing interest in
recent years. In comparison to the outdoor case, it is complicated by
multiple factors: Firstly, many applications of indoor positioning demand
much higher accuracies than outdoor applications. Secondly, global
positioning systems are unavailable, which results either in the necessity
for expensive infrastructure or the use of less accurate
infrastructure-independent positioning approaches. Thirdly, map matching as
used to support outdoor positioning is hindered by the fact that pedestrian
movement is generally less constrained than vehicle movement and, finally,
the availability of indoor models is heavily dependent on the owner of the
respective buildings. However, most infrastructure-independent indoor
positioning approaches need models at least as support information (e.g.
Walder & Bernoulli 2010) or depend completely on them (Link et al. 2011).



In order to overcome this lack of indoor models, we present our approach
for the automatic reconstruction of coarse indoor models from photographed
evacuation plans. Its general feasibility for plans in a single well-known
layout has already been shown in (Peter et al. 2011). Here, we present the
main idea of the approach while concentrating on improvements enabling its
generalization to arbitrary plans. Additionally, we show how the resulting
coarse indoor model may be refined by reconstructing door openings from an
analysis of user tracks derived from a foot-mounted MEMS IMU positioning
system employing the well-known Zero-Velocity-Update approach.

The indoor models reconstructed from photographed evacuation plans can be
erroneous or incomplete. To make our mapping approach robust against the
quality of the input data, we enrich the data driven approach by object
knowledge. Building interiors are subject to numerous geometric and
topological conditions: inner walls are often either perpendicular or
parallel to outer walls; rooms can be adjacent but not overlapping, etc.
Following these conditions, the construction of interiors can be traced
back to basic architectural principles and, thus, lends itself to a formal,
rule-based description. A powerful means to facilitate this is the use of
formal grammars. In this paper, we present our first developments of a
formal rule-based description for interiors.



2. Reconstruction of coarse 2D and 3D Models from 
Photographed Evacuation Plans 

Evacuation plans are a compulsory fixture in all publicly used buildings
in most countries.



Figure 1. Example evacuation plan taken from (Wikipedia 2012)



Apart from additional information like heading, legend, address and behav- 
ioral rules for emergency cases, their main content is a detailed plan of the 
interior building structure in the viewer's vicinity (see Figure 1). This plan's 
level of detail may differ to a great extent between different buildings. The 
plan will contain the locations of installations for use in emergency cases 
(fire extinguishers, fire alarm buttons etc.) as well as evacuation routes, 
while additional information like the positions of doors etc. is not always 
included. 

2.1 Approach Overview 

The general outline of our processing pipeline is as follows: Firstly, the
input image is enhanced by operations correcting the color balance and 
brightness differences. Secondly, the image is divided into foreground and 
background using adaptive threshold binarization. The binary image still 
contains evacuation symbols which have to be detected and removed. The 
structures in the resulting cleaned binary image are thinned in order to de- 
rive the skeleton which is used in a specialized valorization step to bridge 
the symbol areas. In order to enable the use of metric thresholds in this 
step, the outer contour can be matched to an available model of the build- 
ing's outer shell, delivering the transformation parameters from image co- 
ordinates to real world coordinates. During the following reconstruction of 
the facets of the 2D model, apart from rooms also stairs and staircases may 
be detected. The number of detected stairs together with a common stair 
height results in an approximate room height which can be used to extrude 
the edges of the 2D model to walls in the final 3D model. 

2.2 Image Pre-Processing 

The image enhancement processing splits up into two steps: color balance 
and correction of brightness differences in the image. By color balance we 
try to ensure that white areas (namely the plan's background) will be white 
in the image. This is done similarly to the approach described in (The GIMP
Documentation Team 2011). In the histograms of the red, green and blue
channels, respectively, the upper and lower 0.05% of the pixel values will be 
discarded and the histograms stretched accordingly. 
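
The histogram stretch described above can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the function names `stretch_channel` and `white_balance` and the representation of the image as a flat list of 8-bit RGB tuples are assumptions made for the example.

```python
def stretch_channel(values, clip_percent=0.05):
    """Discard the lowest and highest clip_percent of the pixel values of one
    channel and stretch the remaining range linearly to 0..255."""
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * clip_percent / 100.0)  # number of pixels discarded at each end
    low = ordered[k]
    high = ordered[n - 1 - k]
    if high == low:
        return list(values)  # flat channel, nothing to stretch
    out = []
    for v in values:
        scaled = (v - low) * 255.0 / (high - low)
        # values inside the discarded tails are clamped to the new limits
        out.append(int(round(min(255.0, max(0.0, scaled)))))
    return out


def white_balance(pixels, clip_percent=0.05):
    """pixels: list of (r, g, b) tuples; every channel is stretched on its own."""
    channels = [stretch_channel([p[c] for p in pixels], clip_percent)
                for c in range(3)]
    return list(zip(*channels))
```

Because each channel is stretched independently, a color cast of the plan's white background is removed as a side effect of the stretch.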

In order to correct still remaining brightness differences and to achieve a 
radiometric equalization between different images, a local contrast en- 
hancement is carried out using the white balanced image as input. For this 
correction the image is converted to the CIELab color space and the L- 
image is filtered using the Wallis filter (Wallis 1976). After converting the 
CIELab image back to RGB, these corrections affect the full color space. 
Figure 2 depicts a possible result of the complete enhancement in comparison
to the original image.



Figure 2. LHS: original image, RHS: white balanced and Wallis filtered image

2.3 Binarization and Symbol Detection 

Binarization of the image is greatly facilitated by the fact that evacuation 
plans are designed for optimal human legibility even in extreme situations. 
If the plan at hand at least partly follows the design guidelines stated in 
(ISO 2009), this results in a white background, black ground plan elements 
and symbols in signal colors. Thus, a simple adaptive threshold with a large
block size is suitable for all color balanced images. The result of this
operation is depicted in Figure 3, left hand side.
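
As an illustration of such an adaptive threshold, the following sketch compares every pixel with the mean of its surrounding block. The mean-based variant, the function name `adaptive_threshold` and the default values for `block_size` and `offset` are assumptions for this example; the paper does not state which variant or parameters were used.

```python
def adaptive_threshold(gray, block_size=51, offset=10):
    """gray: 2D list of 8-bit gray values. Returns a 2D list with 1 for
    foreground (dark plan elements) and 0 for background. A pixel counts as
    foreground if it is darker than its local block mean minus offset."""
    h, w = len(gray), len(gray[0])
    # integral image with a zero border for fast block sums
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    r = block_size // 2
    binary = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            mean = total / ((y1 - y0) * (x1 - x0))
            binary[y][x] = 1 if gray[y][x] < mean - offset else 0
    return binary
```

A large block keeps the local mean dominated by the white background, so that thick wall lines are not mistaken for background.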

The detection of evacuation symbols in the photographed plan is necessary
not only because these symbols may hide parts of the ground plan structures
to be reconstructed. The contained evacuation information should also be
provided in the reconstructed model, and symbols like arrows may carry
important further information such as the down or up direction of
staircases.

A first approach to this problem, using well-known symbols and
cross-correlation template matching, was already presented in (Peter et
al. 2011). A more general method is the use of color segmentation for
symbol detection. Even though most currently available plans do not follow
the ISO standard (ISO 2009) completely, colors are generally used to
distinguish symbols from other plan elements.




Figure 3. LHS: Binary image derived from the plan depicted in Figure 2; RHS: 
Binary image with symbols overlaid (detected using colour segmentation approach) 



We use this fact and the Color Structure Code (Priese & Sturm 2003) 
combined with thresholds for the expected signal colors (pure red, green, 
blue and yellow). The resulting detected symbol areas can be seen in Figure 
3, right hand side. If the symbols contained in the plan are known, this
color detection approach may also be used to constrain the search space for a
further classification using template matching.
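
The thresholding on signal colors can be illustrated per pixel as below. This sketch does not implement the Color Structure Code segmentation itself; it only shows a possible threshold test for the four expected signal colors. The function name `classify_signal_color` and the threshold values `high` and `low` are assumptions.

```python
def classify_signal_color(r, g, b, high=160, low=100):
    """Return 'red', 'green', 'blue', 'yellow' or None for an 8-bit RGB pixel.
    A channel counts as saturated above `high` and as absent below `low`."""
    hi = [c >= high for c in (r, g, b)]
    lo = [c <= low for c in (r, g, b)]
    if hi[0] and hi[1] and lo[2]:
        return "yellow"
    if hi[0] and lo[1] and lo[2]:
        return "red"
    if lo[0] and hi[1] and lo[2]:
        return "green"
    if lo[0] and lo[1] and hi[2]:
        return "blue"
    return None  # white background, black plan elements or mixed colors
```

White background and black ground plan elements fall through to `None`, so only pixels in signal colors remain as symbol candidates.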

2.4 Geo-referencing by Matching to External Building Shell

In order to enable the use of metric thresholds for the symbol bridging step 
as well as to use the final model for navigation purposes, the transformation 
between image coordinates and real world coordinates has to be found. 
To this end, the outer contour of the reconstructed indoor model is matched
to an existing outer shell (e.g. the ground plan from OpenStreetMap). As
the level of detail in both these shapes may differ to a great extent (see Fig- 
ure 4, red contours) and the reconstructed indoor model may be incom- 
plete, this results in a matching problem between two possibly incomplete 
shapes with unknown scaling and unknown relative orientation. 

For the solution of this problem, firstly, the outer contour in the binary im- 
age is selected and the external shell is scaled and translated to fit to the 
binary image's dimensions. Secondly, both polygons are generalized using a 
2D version of the 3D building generalization approach presented by (Kada 
et al. 2009). The result of this operation is depicted in Figure 4 (blue con- 
tours). 

Then, the angle between adjacent edges and the ratio of their lengths is
computed for every node of these generalized polygons. Using these two
features, similar node pairs are identified and, for every pair, the matched
nodes and the adjacent edges' other ends deliver the parameters of an
approximate affine transformation. The preferable parameter set among these
candidates should fulfill these two conditions: minimum shape modification
and maximum overlap of the transformed source polygon with the destination
polygon. The first condition is checked by comparing the per-node adjacent
edges' length ratio of the transformed polygon to the state before, while
the maximum overlap is computed following the approach presented by (Kada
et al. 2009).
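
The node features and the derivation of transformation parameters from one matched node pair can be sketched as follows. The function names, the tolerance values and the use of Cramer's rule on three point pairs (the matched node plus the two other ends of its adjacent edges) are assumptions for this illustration; the candidate scoring by shape modification and overlap is not shown.

```python
import math


def node_features(polygon):
    """For every node of a closed polygon (list of (x, y)) return the angle
    between its two adjacent edges in degrees and their length ratio."""
    n = len(polygon)
    feats = []
    for i in range(n):
        px, py = polygon[i - 1]
        cx, cy = polygon[i]
        nx, ny = polygon[(i + 1) % n]
        ax, ay = px - cx, py - cy  # vector to the previous node
        bx, by = nx - cx, ny - cy  # vector to the next node
        la, lb = math.hypot(ax, ay), math.hypot(bx, by)
        cos_angle = (ax * bx + ay * by) / (la * lb)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        feats.append((angle, lb / la))
    return feats


def similar_node_pairs(src, dst, angle_tol=5.0, ratio_tol=0.1):
    """Index pairs (i, j) whose angle and length ratio agree within tolerance."""
    fs, fd = node_features(src), node_features(dst)
    return [(i, j)
            for i, (a1, r1) in enumerate(fs)
            for j, (a2, r2) in enumerate(fd)
            if abs(a1 - a2) <= angle_tol and abs(r1 - r2) <= ratio_tol * r2]


def affine_from_triplets(src_pts, dst_pts):
    """Solve x' = a*x + b*y + c and y' = d*x + e*y + f from three point pairs
    using Cramer's rule. Returns (a, b, c, d, e, f)."""
    (x1, y1), (x2, y2), (x3, y3) = src_pts
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")

    def solve(v1, v2, v3):
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    xs = solve(*(p[0] for p in dst_pts))
    ys = solve(*(p[1] for p in dst_pts))
    return xs + ys
```

For a rectangle and a scaled copy of it, `similar_node_pairs` returns several equally plausible pairs per node, which is why every candidate transformation has to be scored afterwards.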

However, the selection using this approach may be imperfect due to sym- 
metric input models (see rectangular generalized shape in Figure 4). In this 
case, the correct transformation may be selected using the comparison of 
the area overlap between the original models as a further descriptive fea- 
ture. 



Figure 4. Original (red) and generalized (blue) versions of the outer contour in the
image (LHS) and the external shell from OpenStreetMap (RHS)

2.5 Symbol Bridging and Final 2D Model 

In order to produce a complete reconstruction of the floor plan depicted in
the evacuation plan, the areas previously covered by symbols have to be 
bridged. For this purpose, the skeleton of the cleaned binary image is de- 
rived using the iterative thinning approach presented by (T. Y. Zhang & 
Suen 1984). Vectorizing this skeleton image delivers end nodes as part of 
the contained topological information. 

For the actual symbol bridging, the edges ending in at least one end node
are prolonged until a length threshold is exceeded, not taking the distance
traveled on symbol areas into account and stopping at structures in the
binary image. These prolonged edges are classified into two classes: those
having a bigger overlap with red symbols than the red symbol size ("long
edges") and all others. We expect red symbols to produce most occlusions,
as these represent safety equipment which is mounted to the walls.
According to the classification, long and other edges are treated
differently in the last completion step: Firstly, all long edges ending in
a structure in the binary image are accepted; secondly, all other long
edges are reconstructed up to their last intersection point with other
prolonged edges; thirdly, all other edges are reconstructed up to their
first intersection point or structure in the binary image, respectively. To
ensure the reconstruction of two-sided walls, the validated edges are
painted into the cleaned binary image using the stroke width of the edge
they prolonged (computed using (Epshtein et al. 2010)).

The contours extracted from the updated binary image are then used to 
reconstruct the facets of the 2D model, which represent hallways, rooms 
and stairs. Figure 5 shows the final 2D model, transformed to world
coordinates and drawn as overlay in OpenStreetMap.




Figure 5. Reconstructed 2D model as overlay in OpenStreetMap (©
OpenStreetMap contributors, Data CC-BY-SA)

2.6 3D Model 

In order to derive an approximate predominant room height usable for the
extrusion of the 2D wall edges to 3D wall facets, the reconstructed facets of
the 2D model are further analyzed and stair candidates are identified.

To this end, the maximum width of each "room" polygon is computed using
the distance transformation (Felzenszwalb & Huttenlocher 2004) and all
polygons not wider than 0.3 m are selected. The length of each stair
candidate is computed as the sum of its skeleton's pixels (plus a
tolerance), converted to a metric value, and only those candidates longer
than 0.8 m are kept.
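
The two selection criteria can be written down compactly. In this sketch the polygon records (dictionaries with the keys `id`, `max_width_m` and `skeleton_pixels`), the function name `stair_candidates` and the default tolerance of two pixels are hypothetical; only the 0.3 m and 0.8 m thresholds are taken from the text.

```python
def stair_candidates(polygons, metres_per_pixel, tolerance_px=2,
                     max_width_m=0.3, min_length_m=0.8):
    """Return the ids of polygons that are narrow and long enough to be
    single stairs. The maximum width is assumed to be precomputed (e.g. from
    a distance transformation) and given in metres."""
    kept = []
    for poly in polygons:
        if poly["max_width_m"] > max_width_m:
            continue  # too wide for a single stair
        # length = skeleton pixel count plus tolerance, converted to metres
        length_m = (poly["skeleton_pixels"] + tolerance_px) * metres_per_pixel
        if length_m > min_length_m:
            kept.append(poly["id"])
    return kept
```

The conversion factor `metres_per_pixel` would follow from the transformation parameters obtained during geo-referencing.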




Figure 6. Automatically derived 3D model (stairs not explicitly reconstructed at this 
stage) 

Grouping neighboring single stair candidates will either cause stair
candidates to be removed (if they have no neighbors) or deliver staircases
(which may still be incomplete due to formerly overlapping symbol regions
and are therefore subject to a completion step). The combination of the
maximum stair number of all staircases and a standard stair height (as
stated in (E. Neufert et al. 2002)) leads to an approximate floor height.
Using this floor height, the 2D model may be extruded to three dimensions
as depicted in Figure 6.
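
The grouping and the height estimate can be sketched as follows. The neighborhood relation is assumed to be given as a list of id pairs, the function names are hypothetical, and the riser height of 0.17 m is an assumed placeholder, not the value taken from (E. Neufert et al. 2002).

```python
def group_staircases(candidate_ids, neighbour_pairs):
    """Group neighboring stair candidates into staircases (connected
    components). Candidates without any neighbor are dropped."""
    parent = {c: c for c in candidate_ids}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for a, b in neighbour_pairs:
        parent[find(a)] = find(b)
    groups = {}
    for c in candidate_ids:
        groups.setdefault(find(c), []).append(c)
    return [g for g in groups.values() if len(g) > 1]


def approximate_floor_height(staircases, riser_height_m=0.17):
    """Floor height = largest stair count of all staircases times the riser
    height. The default riser height is an assumption for this sketch."""
    if not staircases:
        raise ValueError("no staircase detected")
    return max(len(s) for s in staircases) * riser_height_m
```

Using the maximum stair count makes the estimate less sensitive to staircases that are still incomplete because symbols covered some of their stairs.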



3. Model Refinement by the Analysis of User Tracks 

3.1 Indoor Positioning using a foot-mounted MEMS IMU 

For a further refinement of these coarse models, we employ position tracks
of a user which are acquired from the readings of a foot-mounted MEMS IMU
and processed with the zero velocity update approach presented by (Foxlin
2005). To correct small positioning errors that remain due to uneliminated
drift or initial misalignment, we align walked straight lines to one of the
main axes of the building's external shell (as presented in (Peter et al.
2011), see Figure 7).

However, this positioning approach only delivers coordinates relative to a 
known initial position and initial orientation. In order to determine those 
values when starting the positioning process in the building, we propose a 
further analysis of the evacuation plan. The position of the user while he 
photographs the plan is marked by a symbol which can be found either us- 
ing template matching or as a symbol unique in shape or color. Using the 
transformation parameters derived during the model reconstruction step, it 
may be transformed to world coordinates. 

In order to compute the initial orientation, i.e. the camera's pose during the
image acquisition, the approach presented in (Z. Zhang & He 2007) and the
corners of the paper containing the plan are employed. For the computation
of the final orientation, we use the fact that the plan has to be oriented
according to the local environment if it follows (ISO 2009).

3.2 Model Refinement 

While the positions derived by using such approaches are often employed to
guide the user in a pedestrian navigation scenario, the user's position may 
also be utilized to acquire information needed for the refinement of the un- 
derlying indoor model. 

One example is the geo-referencing of semantic information acquired by
user interaction. Here, we have investigated the derivation of room numbers
and people assigned to a room from photographed door plates using Optical
Character Recognition (OCR). However, by analyzing the user tracks in the
context of the coarse model, they may also be used for an update of the
model's geometric features by a fully automated derivation of door
positions. Here, we employ the fact that the user is not able to pass
through a door at arbitrary angles. Thus, if an average person's position
track hits a wall in the model at an angle between 40° and 140°, a door
will be reconstructed. Implicitly, this constraint provides us with a
simple map-matching solution, correcting the track whenever a wall is hit
at angles different from that (see Figure 7).
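
The angle criterion can be sketched for a single track segment and a single wall segment. The function names and the returned action labels are hypothetical; only the 40° to 140° interval is taken from the text, and the actual track correction is not shown.

```python
import math


def intersection(p, q, a, b):
    """Intersection point of track segment p-q and wall segment a-b, or None."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    sx, sy = b[0] - a[0], b[1] - a[1]
    denom = rx * sy - ry * sx
    if abs(denom) < 1e-12:
        return None  # parallel segments never produce a wall hit here
    t = ((a[0] - p[0]) * sy - (a[1] - p[1]) * sx) / denom
    u = ((a[0] - p[0]) * ry - (a[1] - p[1]) * rx) / denom
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p[0] + t * rx, p[1] + t * ry)
    return None


def hit_angle_deg(p, q, a, b):
    """Angle between track direction and wall line, folded into [0, 180)."""
    tx, ty = q[0] - p[0], q[1] - p[1]
    wx, wy = b[0] - a[0], b[1] - a[1]
    return math.degrees(math.atan2(tx * wy - ty * wx, tx * wx + ty * wy)) % 180.0


def evaluate_wall_hit(p, q, a, b, min_angle=40.0, max_angle=140.0):
    """Return (action, point, angle) for a wall hit, or None without a hit.
    action is 'door' inside the angle interval, else 'correct_track'."""
    point = intersection(p, q, a, b)
    if point is None:
        return None
    angle = hit_angle_deg(p, q, a, b)
    action = "door" if min_angle <= angle <= max_angle else "correct_track"
    return action, point, angle
```

Folding the angle into [0, 180) makes the test independent of the direction in which the wall segment is stored.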



4. Grammar for Indoor Modeling 

As the presented reconstruction approaches involve a number of potential
sources of error depending on the quality of the observation data (e.g. in
the symbol bridging, track alignment and map-matching steps), means for 
their support should be found. Therefore, we started the development of a 
formal grammar which is able to store geometric, topological and semantic 
information on building interiors. Generally, in a formal grammar, object 
knowledge is represented through symbols and a set of production or re- 
placement rules. The symbols are called non-terminals if they can be re- 
placed by other symbols, and terminals otherwise. The non-terminal sym- 
bol which defines the starting point for all replacements is the axiom. The 
production rules state the replacement of predecessor symbols by successor 
symbols. By successively applying rules to the axiom, new sequences of 
symbols are generated. 
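
The derivation process can be illustrated with a toy grammar. The symbols and rules below are hypothetical and not part of the indoor grammar; the sketch rewrites all symbols in parallel in each step, as an L-system does, which is one possible way of applying the rules successively.

```python
def derive(axiom, rules, steps):
    """Apply production rules to the axiom for a number of steps.
    rules maps a non-terminal to its list of successor symbols; symbols
    without a rule are terminals and are copied unchanged."""
    symbols = [axiom]
    for _ in range(steps):
        rewritten = []
        for symbol in symbols:
            rewritten.extend(rules.get(symbol, [symbol]))
        symbols = rewritten
    return symbols


# Toy example: axiom S, non-terminal A, terminals a and b (all hypothetical).
toy_rules = {"S": ["A", "A"], "A": ["a", "b"]}
first = derive("S", toy_rules, 1)
second = derive("S", toy_rules, 2)
```

After the second step only terminals remain, so any further step leaves the sequence unchanged.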

For several years formal grammars have been applied successfully for
modeling geometric structures. While (Prusinkiewicz & Lindenmayer 1990)
focus on line structures by e.g. simulating growth processes of plants
through Lindenmayer-systems (L-systems), (Müller et al. 2006) show examples
for the grammar-based reconstruction of building shells. (Gröger & Plümer
2010) present a grammar for indoor modeling, however, without the
possibility to integrate erroneous observation data.

Figure 7. LHS: Coarse model, uncorrected (red) and aligned (yellow) tracks; RHS:
refined model containing reconstructed door and map-matched track (green)

In contrast, our grammar for modeling building interiors is designed in
such a way that it can be derived automatically from observation data in
order to reflect knowledge on individual building interiors.

4.1 Conceptual Considerations 

The first step in developing a formal description for indoor models is to
identify basic geometric primitives as well as characteristic topologic
properties by which an indoor model can be described. For now, we restrict
ourselves to floors and room configurations.

While the arrangement of floors is linear and, thus, shows a one- 
dimensional topology, the topology of room configurations within a floor is 
two-dimensional and, therefore, much more complex. However, such ar- 
rangements are not created by random compositions of walls, but follow 
architectural principles and are subject to functional restrictions. 
Knowledge about architectural principles and geometric and topological 
restrictions helps to understand indoor geometries and detect semantic 
relationships between them. In order to provide as much support as possible
to indoor mapping and modeling, our indoor grammar is designed to be
able to reflect such knowledge. The following properties of building interi- 
ors are crucial for our grammar design: 

1. In order to ensure convenient access to the rooms, buildings -
in particular public buildings - are usually traversed by a
system of corridors.

2. The system of corridors divides each floor into corridor-spaces
and non-corridor-spaces. Non-corridor-spaces can be further
divided into smaller room units which are mostly arranged in a
linear sequence parallel to the adjacent corridor.

3. Depending on their function, such room units feature specific
layouts. For example, in hotels or hospitals a typical room unit
consists of a bedroom and a bathroom.

Properties 1 and 2 give reasons for the application of two different
modeling strategies: The course of the corridors, on the one hand,
resembles a network-like propagation of linear structures. The layout of
the rooms, on the other hand, can be efficiently generated by a spatial
partitioning applied to the interspaces of the corridor network. The
analogy to 3D city modeling becomes apparent: The network of streets
corresponds to the network of corridors, and the segmentation of the
regions lying between the streets into parcels is comparable to the
partitioning of non-corridor-spaces into rooms. This analogy motivates the
adaptation of grammar-based concepts used for city modeling. Similar to
(Müller 2001), who developed an L-system for modeling streets, we will use
a specially designed L-system for the formal description of the corridor
network (which is part of the ongoing work and will not be discussed in
this paper). For the modeling of room layouts we develop a separate grammar
which is mainly based on split operations.

The grouping of functionally related rooms to superior room units, as
mentioned in property 3, is important semantic information which the
grammar should be able to represent. This requirement can be met by
describing characteristic room configurations through the grouping of split
operations to structure-generating production rules. Such a
structure-generating grouping of split operations implies that the splits
have to be carried out in a certain sequential order, which can also be
interpreted as giving different priorities to the splits or rather the
resulting partition walls. We assign high priority to walls which are
incident with two opposite main walls. With the term main wall we denote
walls which are not the result of split operations, like outer building
walls or the boundary walls of corridors. Each partition wall which is
incident with less than two main walls is assigned a low priority. Splits
that produce walls of high priority are carried out first. To illustrate,
Figure 8 shows a real floor plan, in which walls of high priority are
marked in red, and walls of low priority in green.
Figure 8. Floor plan of a hotel in Zurich with high priority-walls (red) and low
priority-walls (green).



4.2 Grammar for the Modeling of Room Configurations 

Our grammar $G_{indoor} = (N, T, S, R)$ is designed to be applied to empty
non-corridor-spaces in order to install room configurations. It comprises
the non-terminals N, the terminals T, the axiom S, and the production rules
R.

The non-terminals and terminals of our grammar correspond to basic
geometric primitives. The set of non-terminals N consists of the symbols S,
Space and Face. S is the axiom which represents an empty
non-corridor-space. The symbol Space stands for an arbitrary 3D solid which
can be further divided. Analogously, Face represents a 2D wall that can
still be decomposed into wall segments. The terminals T describe solids or
walls that are not divisible any further. To distinguish them from
non-terminals, the terminal symbols space and face start with a lower case
letter. Both non-terminals and terminals have attributes. They determine
the space's or face's geometry and type.

The production rules R are defined as replacement rules that perform a
split, a merge or an instantiation. A split divides a Space into two Space
elements along a partition plane. A merge is the inverse operation,
combining two adjacent Space elements into one. The application of split
and merge rules follows the principle of cell decomposition which
automatically provides knowledge about neighborhood relationships between
the spaces and faces. Furthermore, topologically correct reconstruction
results are ensured. An instantiation rule replaces a non-terminal by a
terminal symbol and, thus, creates an instance of the respective geometry.
Besides parameters, each rule has additional semantic rules, which
basically comprise geometric and topologic constraints as well as functions
for the derivation and setting of attribute values. However, for lack of
space within this paper, semantic rules will not be discussed here.

In total, we distinguish between six different types of rules which are listed
in the following. Although the listed rules focus on the application on the 
non-terminal Space, they are also valid for the application to the non- 
terminal Face. 



■ $R_{SingleSplit}$: $Space \rightarrow Split^{Space}(a, d)$
  with $Split^{Space}(a, d) = Space_{left}, Space_{right}$

■ $R_{RepeatSplit}$: $Space \rightarrow RepeatSplit^{Space}(Split)$

■ $R_{StringSplit}$: $Space \rightarrow StringSplit^{Space}(sequence)$
  with $sequence := Split_a, Split_b, Split_c, \ldots$

■ $R_{MultiSplit}$: [graph of $Split$ operations; layout not recoverable]

■ $R_{Merge}$: $Space_{left}, Space_{right} \rightarrow Merge^{Space_{left}, Space_{right}}()$
  with $Merge^{Space_{left}, Space_{right}}() = Space$


Rule type SingleSplit performs a single split operation by replacing the
non-terminal Space by the non-terminals $Space_{left}$ and $Space_{right}$.
$Space_{left}$ and $Space_{right}$ are the result of the function
$Split^{Space}(a, d)$. The superscript "Space" denotes that the split
operation is applied to the non-terminal Space. Orientation and position of
the corresponding partition plane are described by the rule parameters: the
orientation angle a and the distance value d. Both a and d refer to a local
coordinate system which is based on the Space to split.

Rule types RepeatSplit and StringSplit can be used to store knowledge 
about linear sequences of split operations that have high priority (see sec- 
tion 4.1). While RepeatSplit generates a sequence of identical room units by 
repeating a single split operation, StringSplit is able to produce a sequence 
of different room units. Split operations of low priority - applied for
modeling non-linear room layouts - can be aggregated within the rule type
MultiSplit. Since, in this case, the split operations cannot be represented
in sequential order, they need to be represented within a graph structure.
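
To make the three linear split rules more tangible, the following sketch applies them to a strongly simplified Space: an axis-aligned 2D rectangle `(x0, y0, x1, y1)`. This representation, the function names and the restriction of the orientation angle a to 0 or 90 degrees are assumptions for the illustration; MultiSplit, Merge and instantiation are not covered.

```python
def _extent(space, a):
    """Size of the space along the direction in which d is measured."""
    x0, y0, x1, y1 = space
    if a == 90:
        return x1 - x0  # partition wall perpendicular to the local x axis
    if a == 0:
        return y1 - y0  # partition wall parallel to the local x axis
    raise ValueError("this sketch only supports a = 0 or a = 90")


def single_split(space, a, d):
    """SingleSplit: divide one space into a left and a right space."""
    if not 0 < d < _extent(space, a):
        raise ValueError("partition plane lies outside the space")
    x0, y0, x1, y1 = space
    if a == 90:
        return (x0, y0, x0 + d, y1), (x0 + d, y0, x1, y1)
    return (x0, y0, x1, y0 + d), (x0, y0 + d, x1, y1)


def repeat_split(space, a, d):
    """RepeatSplit: repeat the same split on the remainder while it fits."""
    units, rest = [], space
    while d < _extent(rest, a):
        left, rest = single_split(rest, a, d)
        units.append(left)
    units.append(rest)
    return units


def string_split(space, sequence):
    """StringSplit: apply a sequence of different (a, d) splits in order."""
    units, rest = [], space
    for a, d in sequence:
        left, rest = single_split(rest, a, d)
        units.append(left)
    units.append(rest)
    return units
```

For a 10 m by 4 m space, `repeat_split(space, 90, 3)` yields three room units of 3 m width and a remainder of 1 m, while `string_split` produces units of differing width.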

By means of the four split rules, the merge and the instantiation rule, each 
processed evacuation plan can be transferred to a specialized rule system 
which contains detailed knowledge about the construction of the building's 
interior. Based on simple examples, Figure 9 shows how the six rule types
can be geometrically interpreted.
Figure 9. Different rule types and their geometric interpretation.

To verify the practicability of our grammar rules, we applied them to a real 
floor plan (Figure 10) for which a 3D model has already been derived (see 
Figure 7). As can be seen, the six rule types are sufficient to express even
complex room layouts. 
Figure 10. Rule-based description for a real floor plan.



5. Conclusion and Outlook 



In the future, we will continue the development in terms of the L-system
representing the corridor network and the semantic rules which will
complete the grammar.

The information about the individual building's interior structure stored in 
the acquired grammar will enable a further support of the positioning 
technique as well as the reconstruction both from the photographed plans 
and the user tracks. Additionally, we will investigate the possibility of 
reconstructing a model of the indoor environment using only user tracks 
and constraints from the grammar, simultaneously refining the grammar. 